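High-order interaction events are common in real-world applications. Learning embeddings that encode the complex relationships among the participants in these events is of great importance for knowledge mining and predictive tasks. Despite the success of existing approaches, such as Poisson tensor factorization, they ignore the sparse structure underlying the data: the interactions that actually occur are far fewer than the possible interactions among all participants. In this paper, we propose Nonparametric Embeddings of Sparse High-order interaction events (NESH). We hybridize a sparse hypergraph (tensor) process and a matrix Gaussian process to capture both the asymptotic structural sparsity of the interactions and the nonlinear temporal relationships between the participants. We prove strong asymptotic bounds (both a lower and an upper bound) on the sparsity ratio, which reveal the asymptotic properties of the sampled structure. We use batch normalization, a stick-breaking construction, and sparse variational GP approximations to develop an efficient, scalable model inference algorithm. We demonstrate the advantage of our approach in several real-world applications.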
Attention mechanisms form a core component of several successful deep learning architectures and are based on one key idea: "The output depends only on a small (but unknown) segment of the input." In several practical applications, such as image captioning and language translation, this is mostly true. In trained models with an attention mechanism, the outputs of an intermediate module that encodes the segment of the input responsible for the output are often used as a way to peek into the "reasoning" of the network. We make this notion more precise for a variant of the classification problem, which we term selective dependence classification (SDC), when used with attention model architectures. In this setting, we demonstrate various error modes in which an attention model can be accurate but fail to be interpretable, and show that such models do occur as a result of training. We illustrate various situations that can accentuate or mitigate this behaviour. Finally, we use our objective definition of interpretability for SDC tasks to evaluate a few attention model learning algorithms designed to encourage sparsity, and demonstrate that these algorithms help improve interpretability.
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets. Despite promising results, current models still generate factually inconsistent summaries, which reduces their utility in real-world applications. Several recent efforts attempt to address this by devising models that automatically detect factual inconsistencies in machine-generated summaries. However, they focus exclusively on English, a language with abundant resources. In this work, we leverage factual consistency evaluation models to improve multilingual summarization. We explore two intuitive approaches to mitigate hallucinations based on the signal provided by a multilingual NLI model, namely data filtering and controlled generation. Experimental results in the 45 languages from the XLSum dataset show gains over strong baselines in both automatic and human evaluation.
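The data-filtering idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the paper: `entail_score` stands for any function that returns a score in [0, 1] for how well a document supports a summary, and `toy_entail_score` is a made-up token-overlap stand-in for a real multilingual NLI model. The threshold value is also an assumption.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (document, summary)


def toy_entail_score(document: str, summary: str) -> float:
    """Hypothetical stand-in for an NLI model: the fraction of summary
    tokens that also appear in the document."""
    doc_tokens = set(document.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    hits = sum(1 for tok in summary_tokens if tok in doc_tokens)
    return hits / len(summary_tokens)


def filter_by_entailment(
    pairs: List[Pair],
    entail_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[Pair]:
    """Keep only the training pairs whose summary scores at least
    `threshold` against its document."""
    return [(doc, summ) for doc, summ in pairs if entail_score(doc, summ) >= threshold]


pairs = [
    ("the cat sat on the mat", "the cat sat"),
    ("the cat sat on the mat", "a dog barked"),
]
kept = filter_by_entailment(pairs, toy_entail_score)
print(kept)
```

In practice the scorer would be replaced by an actual entailment model, and the threshold would need to be tuned per language.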
We consider the problem of automatically generating stories in multiple languages. Compared to prior work in monolingual story generation, crosslingual story generation allows for more universal research on story planning. We propose to use Prompting Large Language Models with Plans to study which plan is optimal for story generation. We consider 4 types of plans and systematically analyse how the outputs differ across planning strategies. The study demonstrates that formulating the plans as question-answer pairs leads to more coherent generated stories, while the plan gives more control to the story creators.
The importance of smart healthcare in recent years cannot be overstated. This work proposes to extend the state of the art in smart healthcare by integrating solutions for Obsessive Compulsive Disorder (OCD). Identifying OCD from oxidative stress biomarkers (OSBs) using machine learning is an important development in the study of OCD. However, this process involves collecting OCD class labels from hospitals, collecting the corresponding OSBs from biochemical laboratories, creating an integrated and labeled dataset, using a suitable machine learning algorithm to design an OCD prediction model, and making the prediction model available to different biochemical laboratories so that they can predict OCD for unlabeled OSBs. Further, as the volume of labeled samples in the dataset grows significantly over time, the prediction model must be redesigned for further use. The whole process requires distributed data collection, data integration, coordination between the hospital and the biochemical laboratory, dynamic design of the machine learning OCD prediction model using a suitable machine learning algorithm, and making the model available to the biochemical laboratories. With these requirements in mind, Accu-Help, a fully automated, smart, and accurate conceptual model for OCD detection, is proposed to help biochemical laboratories detect OCD efficiently from OSBs. OSBs are classified into three classes: Healthy Individual (HI), OCD Affected Individual (OAI), and Genetically Affected Individual (GAI). The main component of the proposed framework is the design of the machine learning OCD prediction model. In Accu-Help, a neural network-based approach is presented with an OCD prediction accuracy of 86 percent.
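We propose a new formulation for the multi-robot task planning and allocation problem that incorporates (a) precedence relationships between tasks; (b) coordination on tasks, allowing multiple robots to achieve increased efficiency; and (c) cooperation through the formation of robot coalitions for tasks that individual robots cannot perform alone. In our formulation, a task graph specifies the tasks and the relationships between them. We define a set of reward functions over the nodes and edges of the task graph. These functions model the effect of robot coalition size on task performance and incorporate the influence of one task's performance on a dependent task. Solving this problem optimally is NP-hard. However, the task graph formulation allows us to leverage min-cost network flow approaches to obtain approximate solutions efficiently. In addition, we explore a mixed-integer programming approach, which gives optimal solutions for small instances of the problem but is computationally expensive. We also develop a greedy heuristic algorithm as a baseline. Our modeling and solution approaches produce task plans that leverage task precedence relationships as well as robot coordination and cooperation to achieve high mission performance, even in large missions with many agents.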
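Deep neural networks (DNNs) are generally designed as sequentially cascaded differentiable blocks/layers, with a prediction module connected only to the last layer. DNNs can also be attached to prediction modules at multiple points along the backbone, so that inference can stop at an intermediate stage without passing through all the modules. The last exit point may offer a better prediction error, but it also involves more computational resources and latency. An exit point that is "optimal" in terms of both prediction error and cost is desirable. The optimal exit point may depend on the latent distribution of the tasks and may change from one task type to another. During neural inference, the ground truth of an instance may not be available, so the error rate at each exit point cannot be estimated. Hence, one faces the problem of selecting the optimal exit in an unsupervised setting. Prior work tackled this problem in an offline supervised setting, assuming that enough labeled data is available to estimate the error rate at each exit point and to tune the parameters for better accuracy. However, pre-trained DNNs are often deployed in new domains for which a large amount of ground truth may not be available. We model exit selection as an unsupervised online learning problem and use bandit theory to identify the optimal exit point. Specifically, we focus on Elastic BERT, a pre-trained multi-exit DNN, and demonstrate that it "nearly" satisfies the Strong Dominance (SD) property, which makes it possible to learn the optimal exit in an online setup without knowing the ground-truth labels. We develop an upper confidence bound (UCB) based algorithm named UEE-UCB that provably achieves sub-linear regret under the SD property. Our method thus provides a means to adaptively learn domain-specific optimal exit points in multi-exit DNNs. We empirically validate our algorithm on the IMDB and Yelp datasets.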
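The exit-selection idea can be illustrated with a generic UCB-style bandit loop. The sketch below is not the UEE-UCB algorithm; it is a plain UCB1 variant under stated assumptions: each exit has a known, fixed cost, and `sample_loss` is a hypothetical stand-in for whatever label-free feedback is available. The simulation simply draws 0/1 losses from made-up error rates, and the names `select_exit_ucb`, `ERROR_RATES`, and `COSTS` are illustrative.

```python
import math
import random

# Made-up per-exit error rates and costs, for illustration only.
ERROR_RATES = [0.45, 0.15, 0.13, 0.12]
COSTS = [0.0, 0.1, 0.4, 0.7]


def simulated_loss(exit_idx, rng):
    """Hypothetical feedback: a 0/1 loss drawn from the made-up error rate."""
    return 1.0 if rng.random() < ERROR_RATES[exit_idx] else 0.0


def select_exit_ucb(costs, sample_loss, rounds=5000, seed=0):
    """Run a UCB1-style loop that minimises (mean loss + cost) per exit.

    Returns how many times each exit was chosen.
    """
    rng = random.Random(seed)
    k = len(costs)
    counts = [0] * k
    mean_loss = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            exit_idx = t - 1  # try every exit once before using the bonus
        else:
            # Optimism for a minimisation problem: subtract the bonus.
            exit_idx = min(
                range(k),
                key=lambda i: mean_loss[i] + costs[i]
                - math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        loss = sample_loss(exit_idx, rng)
        counts[exit_idx] += 1
        # Incremental update of the running mean loss.
        mean_loss[exit_idx] += (loss - mean_loss[exit_idx]) / counts[exit_idx]
    return counts


counts = select_exit_ucb(COSTS, simulated_loss)
best = counts.index(max(counts))
print("pulls per exit:", counts, "most used exit:", best)
```

With these made-up numbers, the second exit has the lowest expected loss plus cost, so the loop should concentrate on it over time. In the unsupervised setting described above, true losses are not observed, so the feedback would have to come from a label-free proxy rather than from the simulated losses used here.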
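Deepfake refers to tailored and synthetically generated videos, which are now prevalent and spreading on a large scale, threatening the trustworthiness of the information available online. While existing datasets contain different kinds of deepfakes that vary in their generation technique, they do not consider the progression of deepfakes in a "phylogenetic" manner. An existing deepfake face may be swapped with another face. This face-swapping process can be performed multiple times, and the resulting deepfake can evolve to confuse deepfake detection algorithms. Further, many databases do not provide the generative model that was used as a target label. Model attribution helps to enhance the explainability of detection results by providing information on the generative model used. To enable the research community to address these questions, this paper proposes Deephy, a novel deepfake phylogeny dataset consisting of 5040 deepfake videos generated using three different generation techniques. There are 840 videos of once-swapped deepfakes, 2520 videos of twice-swapped deepfakes, and 1680 videos of three-times-swapped deepfakes. The database is over 30 GB in size and was prepared in over 1100 hours using 18 GPUs with 1,352 GB of cumulative memory. We also present a benchmark on the Deephy dataset using six deepfake detection algorithms. The results highlight the need to advance research on model attribution of deepfakes and to generalize the process over a variety of deepfake generation techniques. The database is available at: http://iab-rubric.org/deephy-database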
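In this paper, an artificial-delay-based impedance controller is proposed for robotic manipulators with uncertain dynamics. The control law unites a second-order switching controller of the super-twisting algorithm (STA) type with the time-delayed estimation (TDE) framework via a novel generalized filtered tracking error (GFTE). The TDE framework models the robot dynamics by estimating the uncertain robot dynamics and interaction forces from the immediate past data of the state and control effort, while the second-order switching control law in the outer loop provides robustness against the TDE error that arises from approximating the manipulator dynamics. The proposed control law thus tries to establish a desired impedance model between the robot end-effector variables, i.e., force and motion, in the presence of uncertainties, both when encountering smooth contact forces and during free motion. Simulation results for a two-link manipulator using the proposed controller, along with a convergence analysis, are presented to validate the proposition.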
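Widely used evaluation metrics for text generation either do not work well with longer texts or fail to evaluate all aspects of text quality. In this paper, we introduce a new metric called SMART to mitigate such limitations. Specifically, we treat sentences, rather than tokens, as the basic units of matching, and use a sentence matching function to soft-match candidate and reference sentences. Candidate sentences are also compared with sentences in the source documents to allow grounded (e.g., factuality) evaluation. Our results show that the system-level correlations of our proposed metric with a model-based matching function outperform all competing metrics on the SummEval summarization meta-evaluation dataset. The non-model-based variant of the metric does not use any neural model, which is useful during model development phases, where resources can be limited and fast evaluation is required. Finally, we also conduct extensive analyses showing that our proposed metric works well with longer summaries and is less biased towards specific models.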
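Sentence-level soft matching can be sketched as follows. This is a simplified illustration and not the SMART definition: it assumes a naive punctuation-based sentence splitter, uses token-overlap F1 as a stand-in string-based matching function, and pairs each sentence with its best-scoring counterpart. All function names are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def token_f1(a, b):
    """Stand-in string-based matching function: token-overlap F1 in [0, 1]."""
    ta, tb = a.lower().split(), b.lower().split()
    overlap = sum((Counter(ta) & Counter(tb)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(ta), overlap / len(tb)
    return 2 * precision * recall / (precision + recall)


def sentence_level_f(candidate, reference, match=token_f1):
    """Soft precision/recall/F over sentences instead of tokens."""
    cand, ref = split_sentences(candidate), split_sentences(reference)
    if not cand or not ref:
        return 0.0
    # Each candidate sentence is credited with its best reference match.
    precision = sum(max(match(c, r) for r in ref) for c in cand) / len(cand)
    # Each reference sentence is credited with its best candidate match.
    recall = sum(max(match(r, c) for c in cand) for r in ref) / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


score = sentence_level_f("The cat sat. It rained.", "The cat sat.")
print(round(score, 4))
```

Passing a different `match` function, for example one backed by a neural model, would turn this into a model-based variant. Comparing candidate sentences against source-document sentences instead of reference sentences would use the same function.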